Analysis of Statistical Hypothesis based Learning Mechanism for Faster Crawling
Authors
Abstract
The World Wide Web (WWW) has grown from a negligible number of web pages into a gigantic hub of web information, which steadily increases the complexity of the crawling process in a search engine. A search engine handles a large number of queries from all parts of the world, and its answers depend solely on the knowledge it gathers by crawling. Information sharing has become a common habit in society, carried out by publishing structured, semi-structured, and unstructured resources on the web. This social practice leads to exponential growth of web resources, so it has become essential to crawl continuously in order to update web knowledge and to track modifications to existing resources. In this paper, a statistical-hypothesis-based learning mechanism is incorporated to learn the behaviour of crawling speed in different network environments and to control the speed of the crawler intelligently. A scaling technique is used to compare the performance of the proposed method with that of a standard crawler. High-speed performance is observed after scaling, and the retrieval of relevant web resources at such high speed is analysed.
Similar resources
Analysis of a Statistical Hypothesis Based Learning Mechanism for Faster crawling
Analysis of the No Return Point Hypothesis: The Effect of Audio and Visual Stimuli in the Fast Movements Inhibition
Background. The No Return Point hypothesis is a research area that has developed in line with work on the motor program. The hypothesis emphasizes the inability to inhibit a movement once the motor program has started it. Several factors affect the mechanism of this inhibition. Objectives. In this study, we investigate the effects of audio and visual stimuli on blocking quick moves to ...
Prioritize the ordering of URL queue in Focused crawler
The enormous growth of the World Wide Web in recent years has made it necessary to perform resource discovery efficiently. For a crawler, downloading domain-specific web pages is not a simple task. An unfocused approach often yields undesired results. Therefore, several new ideas have been proposed; a key technique among them is focused crawling, which is able to crawl particular topical...
Sparse Structured Principal Component Analysis and Model Learning for Classification and Quality Detection of Rice Grains
In scientific and commercial fields associated with modern agriculture, the categorization of different rice types and the determination of their quality are very important. Various image processing algorithms have been applied in recent years to detect different agricultural products. In this paper, the problem of rice classification and quality detection is presented based on model learning concepts includ...
Task Effectiveness Predictors: Technique Feature Analysis VS. Involvement Load Hypothesis
How deeply a word is processed has long been considered a crucial factor in vocabulary acquisition. In the literature, two frameworks have been proposed to operationalize the depth of processing, namely the Involvement Load Hypothesis (ILH) and the Technique Feature Analysis (TFA). However, they differ in the way they have operationalized it, especially in terms of their attentional c...
Journal title:
- CoRR
Volume abs/1208.2808, Issue
Pages -
Publication date 2012